[Issue #1288] implement file rewriting to enable garbage collection at the storage layer and fix some visibility bugs #1310

Draft
gengdy1545 wants to merge 8 commits into pixelsdb:master from gengdy1545:feature/storageGC

gengdy1545 (Collaborator) commented Mar 21, 2026

Pixels uses an MVCC architecture that combines immutable columnar files with a visibility chain: writes are append-only, and a delete only sets a mark in the in-memory TileVisibility (Deletion Chain + baseBitmap). The existing Memory GC reclaims expired Deletion Chain blocks in memory, but the physical files themselves never shrink. As deletes accumulate, "ghost rows" (rows that are deleted but still physically stored) inflate storage and degrade scan performance. Storage GC addresses this by rewriting physical files whose deletion ratio is high, reclaiming storage space while guaranteeing that in-flight queries are not affected.
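To make the visibility model concrete, here is a minimal Python sketch of a per-tile visibility structure. It is an illustration only, not the Pixels implementation: the class and method names (`TileVisibility`, `is_visible`, `memory_gc`) and the use of plain sets and lists are assumptions, and it assumes no reader runs with a timestamp below `safe_gc_ts` once Memory GC has folded older deletes into the base bitmap.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class TileVisibility:
    """Hypothetical stand-in for the in-memory visibility state of one tile."""
    # Rows whose deletes have already been folded in by Memory GC.
    base_bitmap: set[int] = field(default_factory=set)
    # Append-only list of (row_id, delete_ts) marks.
    deletion_chain: list[tuple[int, int]] = field(default_factory=list)

    def delete(self, row_id: int, ts: int) -> None:
        # A delete never touches the file; it only appends a mark.
        self.deletion_chain.append((row_id, ts))

    def is_visible(self, row_id: int, read_ts: int) -> bool:
        if row_id in self.base_bitmap:
            return False
        # Hidden only if some delete mark is at or before the reader's timestamp.
        return not any(r == row_id and ts <= read_ts
                       for r, ts in self.deletion_chain)

    def memory_gc(self, safe_gc_ts: int) -> frozenset[int]:
        """Fold marks with ts <= safe_gc_ts into the base bitmap and
        return a snapshot of the deletion bitmap at safe_gc_ts."""
        remaining = []
        for row_id, ts in self.deletion_chain:
            if ts <= safe_gc_ts:
                self.base_bitmap.add(row_id)
            else:
                remaining.append((row_id, ts))
        self.deletion_chain = remaining
        return frozenset(self.base_bitmap)
```

In this sketch the value returned by `memory_gc` plays the role of a gcSnapshotBitmap: it records exactly which rows are dead for every reader at or after `safe_gc_ts`, which is the information a later file rewrite would need.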

  • gcSnapshotBitmap Production: Extend Memory GC to produce precise deletion bitmaps at safeGcTs.
  • Checkpoint with Precomputed Bitmaps: Serialize gcSnapshotBitmaps directly to the checkpoint file.
  • Scan & Group (S1) — invalidRatio Calculation: Identify candidate files and group them for rewrite.
  • Data Rewrite (S2): Read the old files, filter out ghost rows, and write a new compact file.
  • Dual-Write (S3): Keep new/old Visibility in sync during the transition window.
  • Visibility Sync (S4): Migrate Deletion Chain items (ts > safeGcTs) from the old files to the new file.
  • Index Sync (S5): Update MainIndex and SinglePointIndex to point to new file locations.
  • Atomic Switch & Deferred Cleanup (S6): Commit the rewrite and safely retire old files.
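
The S1, S2, and S4 steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the code in this PR: files are modeled as in-memory lists of rows, deletion bitmaps as sets of row ids, and the function names, the `threshold` parameter, and the `(old_file, old_row) -> new_row` mapping are all hypothetical.

```python
def invalid_ratio(deleted_rows, total_rows):
    """S1: fraction of rows in a file that are ghost rows."""
    return len(deleted_rows) / total_rows if total_rows else 0.0


def select_candidates(file_stats, threshold):
    """S1: pick files whose invalid ratio reaches the (hypothetical) threshold.
    file_stats maps file name -> (total_rows, set of deleted row ids)."""
    return [name for name, (total, deleted) in file_stats.items()
            if invalid_ratio(deleted, total) >= threshold]


def rewrite(old_files, snapshot_bitmaps):
    """S2: copy surviving rows into one new file and record where each went.
    Returns (new_rows, mapping) with mapping[(old_file, old_row)] = new_row."""
    new_rows, mapping = [], {}
    for name, rows in old_files.items():
        deleted = snapshot_bitmaps[name]
        for old_row, value in enumerate(rows):
            if old_row in deleted:
                continue  # ghost row: dropped from the new file
            mapping[(name, old_row)] = len(new_rows)
            new_rows.append(value)
    return new_rows, mapping


def migrate_deletions(old_chains, mapping, safe_gc_ts):
    """S4: carry deletion marks newer than safe_gc_ts over to the new file,
    translated to new row ids and ordered by timestamp."""
    migrated = []
    for name, chain in old_chains.items():
        for old_row, ts in chain:
            if ts > safe_gc_ts and (name, old_row) in mapping:
                migrated.append((mapping[(name, old_row)], ts))
    return sorted(migrated, key=lambda item: item[1])
```

For example, rewriting files `a` (4 rows, rows 0 and 2 deleted) and `b` (2 rows, row 1 deleted) yields a 3-row file; a delete of `a`'s row 3 at a timestamp above `safe_gc_ts` is then re-expressed against the new row id 1, so readers of the new file still see it as deleted at the right time.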

@gengdy1545 gengdy1545 self-assigned this Mar 30, 2026
@gengdy1545 gengdy1545 added the enhancement New feature or request label Mar 30, 2026
@gengdy1545 gengdy1545 added this to the Real-time CRUD milestone Mar 30, 2026
@gengdy1545 gengdy1545 linked an issue Mar 30, 2026 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

[pixels-retina, pixels-index, cpp] create new file
